Overview

Dataset statistics

Number of variables: 11
Number of observations: 1048575
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 88.0 MiB
Average record size in memory: 88.0 B
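
The memory figures above are consistent with one another, which can be checked with a little arithmetic. The sketch below assumes each of the 11 columns costs 8 bytes per row (a 64-bit number or an object pointer); that storage assumption is not stated in the report itself.

```python
# Figures copied from the dataset statistics above.
N_VARIABLES = 11
N_OBSERVATIONS = 1_048_575
TOTAL_MIB = 88.0

# Assumption: every column stores one 8-byte value (or pointer) per row.
BYTES_PER_VALUE = 8
MIB = 1024 * 1024

# Size of one record if all 11 columns cost 8 bytes each.
bytes_per_record = N_VARIABLES * BYTES_PER_VALUE

# Average record size implied by the reported total.
average_record_bytes = TOTAL_MIB * MIB / N_OBSERVATIONS

# Size of a single column, to compare with the per-variable "Memory size".
column_mib = N_OBSERVATIONS * BYTES_PER_VALUE / MIB

print(bytes_per_record, round(average_record_bytes, 1), round(column_mib, 1))
```

Under that assumption the per-record size comes to 88 bytes and each column to roughly 8.0 MiB, matching the values the report lists.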

Variable types

Numeric: 8
Categorical: 3

Alerts

isFlaggedFraud has constant value "0" [Constant]
nameOrig has a high cardinality: 1048317 distinct values [High cardinality]
nameDest has a high cardinality: 449635 distinct values [High cardinality]
amount is highly overall correlated with oldbalanceDest and 1 other field [High correlation]
oldbalanceOrg is highly overall correlated with newbalanceOrig [High correlation]
newbalanceOrig is highly overall correlated with oldbalanceOrg [High correlation]
oldbalanceDest is highly overall correlated with amount and 1 other field [High correlation]
newbalanceDest is highly overall correlated with amount and 1 other field [High correlation]
isFraud is highly skewed (γ1 = 30.2521979) [Skewed]
nameOrig is uniformly distributed [Uniform]
oldbalanceOrg has 342214 (32.6%) zeros [Zeros]
newbalanceOrig has 580275 (55.3%) zeros [Zeros]
oldbalanceDest has 437134 (41.7%) zeros [Zeros]
newbalanceDest has 406914 (38.8%) zeros [Zeros]
isFraud has 1047433 (99.9%) zeros [Zeros]
isFlaggedFraud has 1048575 (100.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-04-20 15:36:57.181278
Analysis finished: 2023-04-20 15:40:06.719225
Duration: 3 minutes and 9.54 seconds
Software version: ydata-profiling v4.0.0
Download configuration: config.json
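
For orientation, a report of this kind can be regenerated roughly as follows. This is a minimal sketch rather than the configuration actually used (that lives in config.json, which is not reproduced here). The function name `build_report`, the CSV path, and the report title are hypothetical.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)

    # Default settings; the original run may have used different options.
    profile = ProfileReport(df, title="Transaction data profile")
    profile.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("transactions.csv")` would write `report.html` next to the script. On roughly a million rows the run can take several minutes, in line with the duration reported above.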

Variables

step
Real number (ℝ)

Distinct: 95
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.966174
Minimum: 1
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 15
Median: 20
Q3: 39
95-th percentile: 45
Maximum: 95
Range: 94
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 15.623252
Coefficient of variation (CV): 0.57936478
Kurtosis: 3.4335532
Mean: 26.966174
Median Absolute Deviation (MAD): 11
Skewness: 1.2944546
Sum: 28276056
Variance: 244.08599
Monotonicity: Increasing
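
Several of the figures above are derived from others, so they can be cross-checked. The short sketch below recomputes the range, IQR, coefficient of variation, variance, and sum for `step` from the reported values. The variable names are illustrative only.

```python
# Values copied from the statistics reported for `step`.
minimum, maximum = 1, 95
q1, q3 = 15, 39
mean, std = 26.966174, 15.623252
n_rows = 1_048_575

value_range = maximum - minimum   # reported: 94
iqr = q3 - q1                     # reported: 24
cv = std / mean                   # reported: 0.57936478
variance = std ** 2               # reported: 244.08599
total = mean * n_rows             # reported: 28276056

print(value_range, iqr, round(cv, 5), round(variance, 3), round(total))
```

Small differences in the last digits are expected because the inputs are themselves rounded.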
Histogram with fixed size bins (bins=50)
Value                   Count     Frequency (%)
19                      51352     4.9%
18                      49579     4.7%
43                      45060     4.3%
15                      44609     4.3%
17                      43361     4.1%
16                      42471     4.1%
14                      41485     4.0%
42                      41304     3.9%
20                      40625     3.9%
36                      39774     3.8%
Other values (85)       608955    58.1%

Value                   Count     Frequency (%)
1                       2708      0.3%
2                       1014      0.1%
3                       552       0.1%
4                       565       0.1%
5                       665       0.1%
6                       1660      0.2%
7                       6837      0.7%
8                       21097     2.0%
9                       37628     3.6%
10                      35991     3.4%

Value                   Count     Frequency (%)
95                      2980      0.3%
94                      10372     1.0%
93                      4444      0.4%
92                      10        < 0.1%
91                      8         < 0.1%
90                      16        < 0.1%
89                      6         < 0.1%
88                      8         < 0.1%
87                      6         < 0.1%
86                      18        < 0.1%

type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.0 MiB
CASH_OUT: 373641
PAYMENT: 353873
CASH_IN: 227130
TRANSFER: 86753
DEBIT: 7178

Length

Max length: 8
Median length: 7
Mean length: 7.4253754
Min length: 5

Characters and Unicode

Total characters: 7786063
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PAYMENT
2nd row: PAYMENT
3rd row: TRANSFER
4th row: CASH_OUT
5th row: PAYMENT

Common Values

Value                   Count     Frequency (%)
CASH_OUT                373641    35.6%
PAYMENT                 353873    33.7%
CASH_IN                 227130    21.7%
TRANSFER                86753     8.3%
DEBIT                   7178      0.7%
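
The length and character totals reported for `type` follow directly from the five category counts. As a quick sanity check, the sketch below recomputes the row count, total characters, mean length, and percentage shares. Only the counts are taken from the report; the variable names are illustrative.

```python
# Category counts copied from the "Common Values" table for `type`.
counts = {
    "CASH_OUT": 373_641,
    "PAYMENT": 353_873,
    "CASH_IN": 227_130,
    "TRANSFER": 86_753,
    "DEBIT": 7_178,
}

n_rows = sum(counts.values())

# Each row contributes len(label) characters to the column total.
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / n_rows

# Share of each category, rounded to one decimal as in the report.
shares = {label: round(100 * n / n_rows, 1) for label, n in counts.items()}

print(n_rows, total_chars, round(mean_length, 7), shares)
```

The results agree with the reported 1048575 observations, 7786063 total characters, and mean length of about 7.425.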

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of common values, image not preserved]
Value                   Count     Frequency (%)
cash_out                373641    35.6%
payment                 353873    33.7%
cash_in                 227130    21.7%
transfer                86753     8.3%
debit                   7178      0.7%

Most occurring characters

Value                   Count     Frequency (%)
A                       1041397   13.4%
T                       821445    10.6%
S                       687524    8.8%
N                       667756    8.6%
C                       600771    7.7%
H                       600771    7.7%
_                       600771    7.7%
E                       447804    5.8%
O                       373641    4.8%
U                       373641    4.8%
Other values (8)        1570542   20.2%

Most occurring categories

Value                   Count     Frequency (%)
Uppercase Letter        7185292   92.3%
Connector Punctuation   600771    7.7%

Most frequent character per category

Uppercase Letter
Value                   Count     Frequency (%)
A                       1041397   14.5%
T                       821445    11.4%
S                       687524    9.6%
N                       667756    9.3%
C                       600771    8.4%
H                       600771    8.4%
E                       447804    6.2%
O                       373641    5.2%
U                       373641    5.2%
Y                       353873    4.9%
Other values (7)        1216669   16.9%
Connector Punctuation
Value                   Count     Frequency (%)
_                       600771    100.0%

Most occurring scripts

Value                   Count     Frequency (%)
Latin                   7185292   92.3%
Common                  600771    7.7%

Most frequent character per script

Latin
Value                   Count     Frequency (%)
A                       1041397   14.5%
T                       821445    11.4%
S                       687524    9.6%
N                       667756    9.3%
C                       600771    8.4%
H                       600771    8.4%
E                       447804    6.2%
O                       373641    5.2%
U                       373641    5.2%
Y                       353873    4.9%
Other values (7)        1216669   16.9%
Common
Value                   Count     Frequency (%)
_                       600771    100.0%

Most occurring blocks

Value                   Count     Frequency (%)
ASCII                   7786063   100.0%

Most frequent character per block

ASCII
Value                   Count     Frequency (%)
A                       1041397   13.4%
T                       821445    10.6%
S                       687524    8.8%
N                       667756    8.6%
C                       600771    7.7%
H                       600771    7.7%
_                       600771    7.7%
E                       447804    5.8%
O                       373641    4.8%
U                       373641    4.8%
Other values (8)        1570542   20.2%

amount
Real number (ℝ)

Distinct: 1009606
Distinct (%): 96.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 158666.98
Minimum: 0.1
Maximum: 10000000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0.1
5-th percentile: 2097.002
Q1: 12149.065
Median: 76343.33
Q3: 213761.89
95-th percentile: 519460.35
Maximum: 10000000
Range: 9999999.9
Interquartile range (IQR): 201612.83

Descriptive statistics

Standard deviation: 264940.93
Coefficient of variation (CV): 1.6697925
Kurtosis: 96.805427
Mean: 158666.98
Median Absolute Deviation (MAD): 70322.84
Skewness: 6.3741657
Sum: 1.6637422 × 10^11
Variance: 7.0193697 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count     Frequency (%)
10000000                14        < 0.1%
706.25                  6         < 0.1%
1711.67                 5         < 0.1%
3172.71                 5         < 0.1%
5838.16                 5         < 0.1%
9217.19                 5         < 0.1%
3279.19                 5         < 0.1%
3216.8                  5         < 0.1%
2432.1                  5         < 0.1%
5909.55                 5         < 0.1%
Other values (1009596)  1048515   > 99.9%

Value                   Count     Frequency (%)
0.1                     1         < 0.1%
0.14                    1         < 0.1%
0.2                     1         < 0.1%
0.26                    1         < 0.1%
0.3                     1         < 0.1%
0.32                    1         < 0.1%
0.37                    1         < 0.1%
0.5                     1         < 0.1%
0.52                    1         < 0.1%
0.63                    2         < 0.1%

Value                   Count     Frequency (%)
10000000                14        < 0.1%
9977761.05              2         < 0.1%
9887819.06              2         < 0.1%
9465988.82              2         < 0.1%
9345700.07              2         < 0.1%
9039246.82              2         < 0.1%
8931607.89              2         < 0.1%
8924971.59              2         < 0.1%
8594065.09              2         < 0.1%
7937954.2               2         < 0.1%

nameOrig
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 1048317
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 8.0 MiB
C1214450722: 2
C309111136: 2
C1268675361: 2
C720460198: 2
C1109092856: 2
Other values (1048312): 1048565

Length

Max length: 11
Median length: 11
Mean length: 10.482179
Min length: 5

Characters and Unicode

Total characters: 10991351
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1048059
Unique (%): > 99.9%
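
The distinct and unique counts for `nameOrig` together show how many identifiers repeat and how often. A small worked check follows, reading "unique" as values that occur exactly once. The variable names are illustrative only.

```python
# Figures copied from the nameOrig summary above.
n_rows = 1_048_575
n_distinct = 1_048_317
n_unique = 1_048_059        # values that occur exactly once
total_chars = 10_991_351

# Distinct values that occur more than once.
n_repeated = n_distinct - n_unique

# Rows that carry one of those repeated values.
rows_from_repeats = n_rows - n_unique

# Average number of occurrences per repeated value.
mean_repeat_count = rows_from_repeats / n_repeated

# Mean identifier length implied by the character total.
mean_length = total_chars / n_rows

print(n_repeated, rows_from_repeats, mean_repeat_count, round(mean_length, 6))
```

The arithmetic gives 258 repeated identifiers covering 516 rows, so every repeated identifier appears exactly twice, which agrees with the count of 2 shown for the most frequent values.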

Sample

1st row: C1231006815
2nd row: C1666544295
3rd row: C1305486145
4th row: C840083671
5th row: C2048537720

Common Values

Value                   Count     Frequency (%)
C1214450722             2         < 0.1%
C309111136              2         < 0.1%
C1268675361             2         < 0.1%
C720460198              2         < 0.1%
C1109092856             2         < 0.1%
C545402485              2         < 0.1%
C1362689728             2         < 0.1%
C110179857              2         < 0.1%
C1467095135             2         < 0.1%
C2073023524             2         < 0.1%
Other values (1048307)  1048555   > 99.9%

Length

Histogram of lengths of the category
Value                   Count     Frequency (%)
c1214450722             2         < 0.1%
c443816828              2         < 0.1%
c645536800              2         < 0.1%
c301090204              2         < 0.1%
c563955235              2         < 0.1%
c150158085              2         < 0.1%
c2038530463             2         < 0.1%
c1378765159             2         < 0.1%
c556791598              2         < 0.1%
c263263252              2         < 0.1%
Other values (1048307)  1048555   > 99.9%

Most occurring characters

Value                   Count     Frequency (%)
1                       1450056   13.2%
C                       1048575   9.5%
2                       1011021   9.2%
3                       939627    8.5%
4                       938760    8.5%
6                       935202    8.5%
5                       935188    8.5%
0                       933624    8.5%
7                       933248    8.5%
9                       933108    8.5%

Most occurring categories

Value                   Count     Frequency (%)
Decimal Number          9942776   90.5%
Uppercase Letter        1048575   9.5%

Most frequent character per category

Decimal Number
Value                   Count     Frequency (%)
1                       1450056   14.6%
2                       1011021   10.2%
3                       939627    9.5%
4                       938760    9.4%
6                       935202    9.4%
5                       935188    9.4%
0                       933624    9.4%
7                       933248    9.4%
9                       933108    9.4%
8                       932942    9.4%
Uppercase Letter
Value                   Count     Frequency (%)
C                       1048575   100.0%

Most occurring scripts

Value                   Count     Frequency (%)
Common                  9942776   90.5%
Latin                   1048575   9.5%

Most frequent character per script

Common
Value                   Count     Frequency (%)
1                       1450056   14.6%
2                       1011021   10.2%
3                       939627    9.5%
4                       938760    9.4%
6                       935202    9.4%
5                       935188    9.4%
0                       933624    9.4%
7                       933248    9.4%
9                       933108    9.4%
8                       932942    9.4%
Latin
Value                   Count     Frequency (%)
C                       1048575   100.0%

Most occurring blocks

Value                   Count     Frequency (%)
ASCII                   10991351  100.0%

Most frequent character per block

ASCII
Value                   Count     Frequency (%)
1                       1450056   13.2%
C                       1048575   9.5%
2                       1011021   9.2%
3                       939627    8.5%
4                       938760    8.5%
6                       935202    8.5%
5                       935188    8.5%
0                       933624    8.5%
7                       933248    8.5%
9                       933108    8.5%

oldbalanceOrg
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 391033
Distinct (%): 37.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 874009.54
Minimum: 0
Maximum: 38900000
Zeros: 342214
Zeros (%): 32.6%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 16002
Q3: 136642.02
95-th percentile: 6007521
Maximum: 38900000
Range: 38900000
Interquartile range (IQR): 136642.02

Descriptive statistics

Standard deviation: 2971750.6
Coefficient of variation (CV): 3.4001351
Kurtosis: 30.877779
Mean: 874009.54
Median Absolute Deviation (MAD): 16002
Skewness: 5.1242857
Sum: 9.1646456 × 10^11
Variance: 8.8313014 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count     Frequency (%)
0                       342214    32.6%
10100000                433       < 0.1%
10300000                424       < 0.1%
10200000                421       < 0.1%
10900000                387       < 0.1%
10400000                379       < 0.1%
10700000                378       < 0.1%
10600000                376       < 0.1%
10500000                375       < 0.1%
11000000                337       < 0.1%
Other values (391023)   702851    67.0%

Value                   Count     Frequency (%)
0                       342214    32.6%
0.67                    1         < 0.1%
1                       57        < 0.1%
1.7                     1         < 0.1%
2                       51        < 0.1%
2.36                    1         < 0.1%
3                       53        < 0.1%
4                       54        < 0.1%
4.58                    1         < 0.1%
4.98                    1         < 0.1%

Value                   Count     Frequency (%)
38900000                1         < 0.1%
38600000                1         < 0.1%
38400000                2         < 0.1%
38300000                1         < 0.1%
38200000                1         < 0.1%
38000000                1         < 0.1%
37900000                1         < 0.1%
37500000                1         < 0.1%
37300000                1         < 0.1%
36700000                1         < 0.1%

newbalanceOrig
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 440792
Distinct (%): 42.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 893808.9
Minimum: 0
Maximum: 38900000
Zeros: 580275
Zeros (%): 55.3%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 174599.99
95-th percentile: 6168354.9
Maximum: 38900000
Range: 38900000
Interquartile range (IQR): 174599.99

Descriptive statistics

Standard deviation: 3008271.3
Coefficient of variation (CV): 3.3656762
Kurtosis: 30.139652
Mean: 893808.9
Median Absolute Deviation (MAD): 0
Skewness: 5.0604564
Sum: 9.3722567 × 10^11
Variance: 9.0496964 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count     Frequency (%)
0                       580275    55.3%
10300000                450       < 0.1%
10100000                449       < 0.1%
10200000                435       < 0.1%
10900000                405       < 0.1%
10700000                400       < 0.1%
10400000                399       < 0.1%
10600000                391       < 0.1%
10500000                388       < 0.1%
11000000                356       < 0.1%
Other values (440782)   464627    44.3%

Value                   Count     Frequency (%)
0                       580275    55.3%
0.67                    1         < 0.1%
0.73                    1         < 0.1%
1.17                    1         < 0.1%
1.48                    1         < 0.1%
1.52                    1         < 0.1%
1.63                    1         < 0.1%
1.7                     1         < 0.1%
1.78                    1         < 0.1%
2.01                    1         < 0.1%

Value                   Count     Frequency (%)
38900000                2         < 0.1%
38600000                1         < 0.1%
38400000                2         < 0.1%
38300000                1         < 0.1%
38200000                1         < 0.1%
38000000                1         < 0.1%
37900000                1         < 0.1%
37500000                1         < 0.1%
37300000                1         < 0.1%
36700000                1         < 0.1%

nameDest
Categorical

Distinct: 449635
Distinct (%): 42.9%
Missing: 0
Missing (%): 0.0%
Memory size: 8.0 MiB
C985934102: 98
C1286084959: 96
C1590550415: 89
C248609774: 88
C665576141: 87
Other values (449630): 1048117

Length

Max length: 11
Median length: 11
Mean length: 10.479478
Min length: 4

Characters and Unicode

Total characters: 10988519
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 371364
Unique (%): 35.4%

Sample

1st row: M1979787155
2nd row: M2044282225
3rd row: C553264065
4th row: C38997010
5th row: M1230701703

Common Values

Value                   Count     Frequency (%)
C985934102              98        < 0.1%
C1286084959             96        < 0.1%
C1590550415             89        < 0.1%
C248609774              88        < 0.1%
C665576141              87        < 0.1%
C2083562754             86        < 0.1%
C977993101              82        < 0.1%
C1360767589             81        < 0.1%
C451111351              80        < 0.1%
C306206744              79        < 0.1%
Other values (449625)   1047709   99.9%

Length

Histogram of lengths of the category
Value                   Count     Frequency (%)
c985934102              98        < 0.1%
c1286084959             96        < 0.1%
c1590550415             89        < 0.1%
c248609774              88        < 0.1%
c665576141              87        < 0.1%
c2083562754             86        < 0.1%
c977993101              82        < 0.1%
c1360767589             81        < 0.1%
c451111351              80        < 0.1%
c306206744              79        < 0.1%
Other values (449625)   1047709   99.9%

Most occurring characters

Value                   Count     Frequency (%)
1                       1446578   13.2%
2                       1011948   9.2%
8                       939332    8.5%
3                       938299    8.5%
4                       937381    8.5%
0                       935604    8.5%
7                       934680    8.5%
6                       932589    8.5%
5                       932032    8.5%
9                       931501    8.5%
Other values (2)        1048575   9.5%

Most occurring categories

Value                   Count     Frequency (%)
Decimal Number          9939944   90.5%
Uppercase Letter        1048575   9.5%

Most frequent character per category

Decimal Number
Value                   Count     Frequency (%)
1                       1446578   14.6%
2                       1011948   10.2%
8                       939332    9.5%
3                       938299    9.4%
4                       937381    9.4%
0                       935604    9.4%
7                       934680    9.4%
6                       932589    9.4%
5                       932032    9.4%
9                       931501    9.4%
Uppercase Letter
Value                   Count     Frequency (%)
C                       694702    66.3%
M                       353873    33.7%

Most occurring scripts

Value                   Count     Frequency (%)
Common                  9939944   90.5%
Latin                   1048575   9.5%

Most frequent character per script

Common
Value                   Count     Frequency (%)
1                       1446578   14.6%
2                       1011948   10.2%
8                       939332    9.5%
3                       938299    9.4%
4                       937381    9.4%
0                       935604    9.4%
7                       934680    9.4%
6                       932589    9.4%
5                       932032    9.4%
9                       931501    9.4%
Latin
Value                   Count     Frequency (%)
C                       694702    66.3%
M                       353873    33.7%

Most occurring blocks

Value                   Count     Frequency (%)
ASCII                   10988519  100.0%

Most frequent character per block

ASCII
Value                   Count     Frequency (%)
1                       1446578   13.2%
2                       1011948   9.2%
8                       939332    8.5%
3                       938299    8.5%
4                       937381    8.5%
0                       935604    8.5%
7                       934680    8.5%
6                       932589    8.5%
5                       932032    8.5%
9                       931501    8.5%
Other values (2)        1048575   9.5%

oldbalanceDest
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 590110
Distinct (%): 56.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 978160.05
Minimum: 0
Maximum: 42100000
Zeros: 437134
Zeros (%): 41.7%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 126377.21
Q3: 915923.47
95-th percentile: 4686136.4
Maximum: 42100000
Range: 42100000
Interquartile range (IQR): 915923.47

Descriptive statistics

Standard deviation: 2296780.4
Coefficient of variation (CV): 2.3480619
Kurtosis: 42.638314
Mean: 978160.05
Median Absolute Deviation (MAD): 126377.21
Skewness: 5.3731949
Sum: 1.0256742 × 10^12
Variance: 5.2752002 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count     Frequency (%)
0                       437134    41.7%
10100000                314       < 0.1%
10300000                304       < 0.1%
10200000                295       < 0.1%
10900000                295       < 0.1%
10800000                291       < 0.1%
10500000                291       < 0.1%
10600000                284       < 0.1%
10700000                276       < 0.1%
10400000                265       < 0.1%
Other values (590100)   608826    58.1%

Value                   Count     Frequency (%)
0                       437134    41.7%
0.37                    1         < 0.1%
1                       6         < 0.1%
2                       7         < 0.1%
2.94                    1         < 0.1%
3                       1         < 0.1%
3.11                    1         < 0.1%
4                       4         < 0.1%
5                       4         < 0.1%
6                       4         < 0.1%

Value                   Count     Frequency (%)
42100000                1         < 0.1%
41500000                1         < 0.1%
41400000                2         < 0.1%
41300000                4         < 0.1%
41100000                3         < 0.1%
41000000                1         < 0.1%
40900000                1         < 0.1%
39900000                1         < 0.1%
39000000                3         < 0.1%
38900000                1         < 0.1%

newbalanceDest
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 437054
Distinct (%): 41.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1114198
Minimum: 0
Maximum: 42200000
Zeros: 406914
Zeros (%): 38.8%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 218260.36
Q3: 1149807.5
95-th percentile: 5140096.1
Maximum: 42200000
Range: 42200000
Interquartile range (IQR): 1149807.5

Descriptive statistics

Standard deviation: 2416593.1
Coefficient of variation (CV): 2.1689082
Kurtosis: 37.4222
Mean: 1114198
Median Absolute Deviation (MAD): 218260.36
Skewness: 5.0124557
Sum: 1.1683201 × 10^12
Variance: 5.8399223 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count     Frequency (%)
0                       406914    38.8%
10200000                361       < 0.1%
10900000                350       < 0.1%
10500000                348       < 0.1%
10100000                343       < 0.1%
10400000                327       < 0.1%
10300000                324       < 0.1%
10800000                310       < 0.1%
11000000                304       < 0.1%
10600000                292       < 0.1%
Other values (437044)   638702    60.9%

Value                   Count     Frequency (%)
0                       406914    38.8%
0.33                    1         < 0.1%
2.94                    1         < 0.1%
3.11                    1         < 0.1%
7.7                     1         < 0.1%
9.69                    1         < 0.1%
10.98                   1         < 0.1%
12.1                    4         < 0.1%
12.82                   6         < 0.1%
13.47                   1         < 0.1%

Value                   Count     Frequency (%)
42200000                1         < 0.1%
42100000                1         < 0.1%
41500000                1         < 0.1%
41400000                3         < 0.1%
41300000                5         < 0.1%
41100000                2         < 0.1%
40900000                1         < 0.1%
39900000                2         < 0.1%
39000000                2         < 0.1%
38900000                1         < 0.1%

isFraud
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0010890971
Minimum: 0
Maximum: 1
Zeros: 1047433
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 1
Range: 1
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.032983511
Coefficient of variation (CV): 30.285189
Kurtosis: 913.19722
Mean: 0.0010890971
Median Absolute Deviation (MAD): 0
Skewness: 30.252198
Sum: 1142
Variance: 0.001087912
Monotonicity: Not monotonic
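
Because `isFraud` only takes the values 0 and 1, its moments are determined by the fraud rate alone. The sketch below recomputes them from the two counts using the standard formulas for a Bernoulli variable. These are population formulas, so they are expected to differ slightly from the report's values, which appear to be sample-adjusted.

```python
import math

# Counts copied from the isFraud statistics above.
n_rows = 1_048_575
n_fraud = 1_142

p = n_fraud / n_rows                          # mean of a 0/1 column
variance = p * (1 - p)                        # population variance
skewness = (1 - 2 * p) / math.sqrt(variance)  # Bernoulli skewness
excess_kurtosis = (1 - 6 * variance) / variance

print(round(p, 10), round(variance, 9), round(skewness, 3), round(excess_kurtosis, 2))
```

The skewness of about 30.25 is a direct consequence of the 0.1% positive rate. Such extreme class imbalance usually matters more for modelling than the skewness alert itself.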
Histogram with fixed size bins (bins=2)
Value                   Count     Frequency (%)
0                       1047433   99.9%
1                       1142      0.1%

Value                   Count     Frequency (%)
0                       1047433   99.9%
1                       1142      0.1%

Value                   Count     Frequency (%)
1                       1142      0.1%
0                       1047433   99.9%

isFlaggedFraud
Real number (ℝ)

CONSTANT  ZEROS 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0
Minimum: 0
Maximum: 0
Zeros: 1048575
Zeros (%): 100.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 0
Range: 0
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0
Coefficient of variation (CV): nan
Kurtosis: 0
Mean: 0
Median Absolute Deviation (MAD): 0
Skewness: 0
Sum: 0
Variance: 0
Monotonicity: Increasing
Histogram with fixed size bins (bins=1)
Value                   Count     Frequency (%)
0                       1048575   100.0%

Value                   Count     Frequency (%)
0                       1048575   100.0%

Value                   Count     Frequency (%)
0                       1048575   100.0%

Interactions

[Figures: 64 interaction plots, images not preserved]

Correlations

variable        step            amount          oldbalanceOrg   newbalanceOrig  oldbalanceDest  newbalanceDest  isFraud         type
step            1.000           -0.036          -0.024          -0.022          0.007           -0.002          0.026           0.048
amount          -0.036          1.000           0.030           -0.093          0.603           0.672           0.028           0.216
oldbalanceOrg   -0.024          0.030           1.000           0.815           0.009           -0.026          0.031           0.262
newbalanceOrig  -0.022          -0.093          0.815           1.000           0.022           -0.113          -0.027          0.266
oldbalanceDest  0.007           0.603           0.009           0.022           1.000           0.925           -0.016          0.093
newbalanceDest  -0.002          0.672           -0.026          -0.113          0.925           1.000           -0.005          0.109
isFraud         0.026           0.028           0.031           -0.027          -0.016          -0.005          1.000           0.054
type            0.048           0.216           0.262           0.266           0.093           0.109           0.054           1.000
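
The "High correlation" alerts in the overview can be traced back to this matrix. The sketch below lists every pair whose absolute coefficient meets a cutoff. The cutoff of 0.5 is an assumption chosen because it reproduces the alerted pairs, not a value stated in this report, and `high_correlations` is a hypothetical helper.

```python
from itertools import combinations

names = ["step", "amount", "oldbalanceOrg", "newbalanceOrig",
         "oldbalanceDest", "newbalanceDest", "isFraud", "type"]

# Coefficients copied from the correlation table above.
matrix = [
    [1.000, -0.036, -0.024, -0.022, 0.007, -0.002, 0.026, 0.048],
    [-0.036, 1.000, 0.030, -0.093, 0.603, 0.672, 0.028, 0.216],
    [-0.024, 0.030, 1.000, 0.815, 0.009, -0.026, 0.031, 0.262],
    [-0.022, -0.093, 0.815, 1.000, 0.022, -0.113, -0.027, 0.266],
    [0.007, 0.603, 0.009, 0.022, 1.000, 0.925, -0.016, 0.093],
    [-0.002, 0.672, -0.026, -0.113, 0.925, 1.000, -0.005, 0.109],
    [0.026, 0.028, 0.031, -0.027, -0.016, -0.005, 1.000, 0.054],
    [0.048, 0.216, 0.262, 0.266, 0.093, 0.109, 0.054, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def high_correlations(names, matrix, threshold):
    """Return (a, b, r) for each pair with |r| >= threshold, strongest first."""
    pairs = [
        (names[i], names[j], matrix[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(matrix[i][j]) >= threshold
    ]
    return sorted(pairs, key=lambda pair: -abs(pair[2]))


pairs = high_correlations(names, matrix, THRESHOLD)
for a, b, r in pairs:
    print(f"{a} ~ {b}: {r:.3f}")
```

With that cutoff, four pairs are flagged: the two destination balances, the two origin balances, and amount against each destination balance. These are the same relationships named in the alerts.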

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

index    step  type      amount     nameOrig     oldbalanceOrg  newbalanceOrig  nameDest     oldbalanceDest  newbalanceDest  isFraud  isFlaggedFraud
0        1     PAYMENT   9839.64    C1231006815  170136.00      160296.36       M1979787155  0.0             0.00            0        0
1        1     PAYMENT   1864.28    C1666544295  21249.00       19384.72        M2044282225  0.0             0.00            0        0
2        1     TRANSFER  181.00     C1305486145  181.00         0.00            C553264065   0.0             0.00            1        0
3        1     CASH_OUT  181.00     C840083671   181.00         0.00            C38997010    21182.0         0.00            1        0
4        1     PAYMENT   11668.14   C2048537720  41554.00       29885.86        M1230701703  0.0             0.00            0        0
5        1     PAYMENT   7817.71    C90045638    53860.00       46042.29        M573487274   0.0             0.00            0        0
6        1     PAYMENT   7107.77    C154988899   183195.00      176087.23       M408069119   0.0             0.00            0        0
7        1     PAYMENT   7861.64    C1912850431  176087.23      168225.59       M633326333   0.0             0.00            0        0
8        1     PAYMENT   4024.36    C1265012928  2671.00        0.00            M1176932104  0.0             0.00            0        0
9        1     DEBIT     5337.77    C712410124   41720.00       36382.23        C195600860   41898.0         40348.79        0        0

index    step  type      amount     nameOrig     oldbalanceOrg  newbalanceOrig  nameDest     oldbalanceDest  newbalanceDest  isFraud  isFlaggedFraud
1048565  95    TRANSFER  132387.24  C1654402840  15956.51       0.00            C1878219072  631284.08       763671.32       0        0
1048566  95    PAYMENT   12598.15   C565523855   30601.00       18002.85        M1740980642  0.00            0.00            0        0
1048567  95    CASH_OUT  279674.05  C990252469   18002.85       0.00            C574439165   1847488.28      2127162.32      0        0
1048568  95    PAYMENT   20721.54   C954269986   49732.00       29010.46        M812667644   0.00            0.00            0        0
1048569  95    PAYMENT   3210.11    C2113264897  11113.00       7902.89         M1989479599  0.00            0.00            0        0
1048570  95    CASH_OUT  132557.35  C1179511630  479803.00      347245.65       C435674507   484329.37       616886.72       0        0
1048571  95    PAYMENT   9917.36    C1956161225  90545.00       80627.64        M668364942   0.00            0.00            0        0
1048572  95    PAYMENT   14140.05   C2037964975  20545.00       6404.95         M1355182933  0.00            0.00            0        0
1048573  95    PAYMENT   10020.05   C1633237354  90605.00       80584.95        M1964992463  0.00            0.00            0        0
1048574  95    PAYMENT   11450.03   C1264356443  80584.95       69134.92        M677577406   0.00            0.00            0        0
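
The sample rows suggest a simple relationship between `amount` and the origin balances for outgoing transactions: the new balance equals the old balance minus the amount, floored at zero. That rule is inferred from the rows shown here and is not stated by the report. The sketch below tests it on a handful of the sample rows, and `expected_new_balance` is a hypothetical helper.

```python
# (type, amount, oldbalanceOrg, newbalanceOrig) copied from sample rows above.
rows = [
    ("PAYMENT", 9839.64, 170136.00, 160296.36),
    ("PAYMENT", 1864.28, 21249.00, 19384.72),
    ("TRANSFER", 181.00, 181.00, 0.00),
    ("PAYMENT", 4024.36, 2671.00, 0.00),
    ("TRANSFER", 132387.24, 15956.51, 0.00),
    ("CASH_OUT", 132557.35, 479803.00, 347245.65),
]


def expected_new_balance(amount, old_balance):
    """Debit the origin account, flooring at zero (an assumed rule)."""
    return max(old_balance - amount, 0.0)


# Allow half a cent of slack for floating-point and rounding noise.
mismatches = [
    row for row in rows
    if abs(expected_new_balance(row[1], row[2]) - row[3]) > 0.005
]

print(len(mismatches))
```

None of these rows violates the assumed rule. Incoming types such as CASH_IN would need the opposite sign and are not covered by this check.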